Overview

Dataset statistics

Number of variables: 23
Number of observations: 5425
Missing cells: 11971
Missing cells (%): 9.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 974.9 KiB
Average record size in memory: 184.0 B
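
The derived figures in this table follow from the raw counts. The short Python check below is a sketch that uses only the numbers reported above (the variable names are illustrative); it recomputes the missing-cell percentage and the average record size.

```python
# Figures copied from the "Dataset statistics" table above.
n_variables = 23
n_observations = 5425
missing_cells = 11971
total_size_kib = 974.9

# Every observation has one cell per variable.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# 1 KiB = 1024 bytes.
avg_record_bytes = total_size_kib * 1024 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```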

Variable types

CAT: 15
NUM: 8

Warnings

centro_escolar_acceso has a high cardinality: 137 distinct values
fecha_nacimiento has a high cardinality: 3690 distinct values
municipio has a high cardinality: 435 distinct values
tipo_traslado has 4525 (83.4%) missing values
nota_acceso has 1578 (29.1%) missing values
nota_admision_def has 2725 (50.2%) missing values
centro_escolar_acceso has 2461 (45.4%) missing values
cod_provincia has 162 (3.0%) missing values
provincia has 162 (3.0%) missing values
cod_municipio has 179 (3.3%) missing values
municipio has 179 (3.3%) missing values
fecha_nacimiento is uniformly distributed

Reproduction

Analysis started: 2021-05-14 09:05:58.059303
Analysis finished: 2021-05-14 09:06:06.373543
Duration: 8.31 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

expediente
Real number (ℝ≥0)

Distinct: 1718
Distinct (%): 31.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 597.3996313
Minimum: 0
Maximum: 1851
Zeros: 3
Zeros (%): 0.1%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 24
Q1: 148
Median: 458
Q3: 964
95th percentile: 1596
Maximum: 1851
Range: 1851
Interquartile range (IQR): 816

Descriptive statistics

Standard deviation: 508.7121391
Coefficient of variation (CV): 0.8515441129
Kurtosis: -0.6828524355
Mean: 597.3996313
Median Absolute Deviation (MAD): 357
Skewness: 0.6934634878
Sum: 3240893
Variance: 258788.0405
Monotonicity: Not monotonic
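
Several of these statistics are derived from others: the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks those two relationships against the figures reported above and shows, on a small made-up list, how range, IQR and MAD can be computed with the standard library. The helper name `derived_stats` and the toy data are illustrative assumptions, and the quartile method chosen here (`method="inclusive"`) may not match the interpolation the report used.

```python
import statistics

# Figures reported for `expediente` in this section.
mean = 597.3996313
std = 508.7121391

cv = std / mean        # coefficient of variation
variance = std ** 2    # variance is the squared standard deviation
print(f"CV = {cv:.6f}, variance = {variance:.2f}")


def derived_stats(values):
    """Range, IQR and MAD for a list of numbers (illustrative helper)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    med = statistics.median(values)
    # MAD: median of the absolute deviations from the median.
    mad = statistics.median([abs(v - med) for v in values])
    return {"range": max(values) - min(values), "iqr": q3 - q1, "mad": mad}


toy = [1, 2, 3, 4, 100]  # made-up data, not taken from the dataset
print(derived_stats(toy))
```

On the toy list the single outlier inflates the range while the IQR and MAD barely move, which is why the report lists all three.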
Histogram with fixed size bins (bins=50)
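Common values
Value | Count | Frequency (%)
9 | 14 | 0.3%
10 | 13 | 0.2%
28 | 13 | 0.2%
18 | 13 | 0.2%
3 | 13 | 0.2%
11 | 13 | 0.2%
6 | 13 | 0.2%
2 | 13 | 0.2%
20 | 13 | 0.2%
16 | 13 | 0.2%
Other values (1708) | 5294 | 97.6%

Minimum 5 values
Value | Count | Frequency (%)
0 | 3 | 0.1%
1 | 10 | 0.2%
2 | 13 | 0.2%
3 | 13 | 0.2%
4 | 13 | 0.2%

Maximum 5 values
Value | Count | Frequency (%)
1851 | 1 | < 0.1%
1847 | 1 | < 0.1%
1846 | 1 | < 0.1%
1844 | 1 | < 0.1%
1840 | 1 | < 0.1%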

cod_plan
Real number (ℝ≥0)

Distinct: 18
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1627.792074
Minimum: 1621
Maximum: 1639
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1621
5th percentile: 1623
Q1: 1626
Median: 1627
Q3: 1632
95th percentile: 1635
Maximum: 1639
Range: 18
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 3.639889297
Coefficient of variation (CV): 0.002236089827
Kurtosis: 0.04872817441
Mean: 1627.792074
Median Absolute Deviation (MAD): 1
Skewness: 0.8025982261
Sum: 8830772
Variance: 13.24879409
Monotonicity: Increasing
Histogram with fixed size bins (bins=18)

Common values
Value | Count | Frequency (%)
1626 | 1412 | 26.0%
1632 | 935 | 17.2%
1627 | 740 | 13.6%
1623 | 617 | 11.4%
1628 | 568 | 10.5%
1625 | 272 | 5.0%
1624 | 247 | 4.6%
1635 | 156 | 2.9%
1631 | 132 | 2.4%
1636 | 102 | 1.9%
Other values (8) | 244 | 4.5%

Minimum 5 values
Value | Count | Frequency (%)
1621 | 17 | 0.3%
1622 | 10 | 0.2%
1623 | 617 | 11.4%
1624 | 247 | 4.6%
1625 | 272 | 5.0%

Maximum 5 values
Value | Count | Frequency (%)
1639 | 52 | 1.0%
1638 | 29 | 0.5%
1636 | 102 | 1.9%
1635 | 156 | 2.9%
1634 | 66 | 1.2%

des_plan
Categorical

Distinct: 16
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
GRADO EN EDIFICACIÓN | 1412
GRADO EN INGENIERÍA INFORMÁTICA EN INGENIERÍA DEL SOFTWARE | 935
GRADO EN INGENIERÍA INFORMÁTICA EN INGENIERÍA DE COMPUTADORES | 740
GRADO EN INGENIERÍA CIVIL - CONSTRUCCIONES CIVILES | 617
GRADO EN INGENIERÍA DE SONIDO E IMAGEN EN TELECOMUNICACIÓN | 568
Other values (11) | 1153

Value | Count | Frequency (%)
GRADO EN EDIFICACIÓN | 1412 | 26.0%
GRADO EN INGENIERÍA INFORMÁTICA EN INGENIERÍA DEL SOFTWARE | 935 | 17.2%
GRADO EN INGENIERÍA INFORMÁTICA EN INGENIERÍA DE COMPUTADORES | 740 | 13.6%
GRADO EN INGENIERÍA CIVIL - CONSTRUCCIONES CIVILES | 617 | 11.4%
GRADO EN INGENIERÍA DE SONIDO E IMAGEN EN TELECOMUNICACIÓN | 568 | 10.5%
GRADO EN INGENIERÍA CIVIL - TRANSPORTES Y SERVICIOS URBANOS | 272 | 5.0%
GRADO EN INGENIERÍA CIVIL - HIDROLOGÍA | 247 | 4.6%
MÁSTER UNIVERSITARIO EN INVESTIGACIÓN EN INGENIERIA Y ARQUITECTURA | 161 | 3.0%
MÁSTER UNIVERSITARIO EN INGENIERÍA DE TELECOMUNICACIÓN | 156 | 2.9%
MÁSTER UNIVERSITARIO EN INGENIERÍA INFORMÁTICA | 102 | 1.9%
Other values (6) | 215 | 4.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 74
Median length: 58
Mean length: 46.58470046
Min length: 20

anio_apertura_expediente
Categorical

Distinct: 14
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
2010-11 | 1101
2011-12 | 652
2013-14 | 539
2009-10 | 513
2012-13 | 495
Other values (9) | 2125

Value | Count | Frequency (%)
2010-11 | 1101 | 20.3%
2011-12 | 652 | 12.0%
2013-14 | 539 | 9.9%
2009-10 | 513 | 9.5%
2012-13 | 495 | 9.1%
2014-15 | 444 | 8.2%
2015-16 | 311 | 5.7%
2016-17 | 277 | 5.1%
2017-18 | 276 | 5.1%
2018-19 | 268 | 4.9%
Other values (4) | 549 | 10.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

exp_cerrado
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
N | 3192
S | 2233

Value | Count | Frequency (%)
N | 3192 | 58.8%
S | 2233 | 41.2%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

exp_trasladado
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
N | 4525
S | 900

Value | Count | Frequency (%)
N | 4525 | 83.4%
S | 900 | 16.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

tipo_traslado
Categorical

MISSING

Distinct: 3
Distinct (%): 0.3%
Missing: 4525
Missing (%): 83.4%
Memory size: 42.4 KiB

Value | Count
I | 488
S | 219
E | 193

Value | Count | Frequency (%)
I | 488 | 9.0%
S | 219 | 4.0%
E | 193 | 3.6%
(Missing) | 4525 | 83.4%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 2.668202765
Min length: 1
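
The length statistics look odd for a column whose observed values are single characters. One reading that reproduces the reported mean, offered here as an assumption rather than something the report states, is that each of the 4525 missing entries was measured as the three-character string `nan`. The check below uses only the counts shown above.

```python
# Counts copied from the tipo_traslado tables above.
observed = {"I": 488, "S": 219, "E": 193}
n_missing = 4525

# Assumption: each missing entry is measured as the string "nan" (length 3).
missing_repr = "nan"

total_length = sum(len(value) * count for value, count in observed.items())
total_length += len(missing_repr) * n_missing
n_rows = sum(observed.values()) + n_missing

mean_length = total_length / n_rows
print(f"{n_rows} rows, mean length {mean_length:.9f}")
```

If that reading is right, the maximum and median length of 3 describe the missing-value placeholder rather than any real category.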

exp_bloqueado
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
N | 5334
S | 91

Value | Count | Frequency (%)
N | 5334 | 98.3%
S | 91 | 1.7%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%
Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

anio_convocatoria_acceso
Categorical

Distinct: 49
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
2009-10 | 620
2008-09 | 539
2010-11 | 531
2011-12 | 379
2012-13 | 307
Other values (44) | 3049

Value | Count | Frequency (%)
2009-10 | 620 | 11.4%
2008-09 | 539 | 9.9%
2010-11 | 531 | 9.8%
2011-12 | 379 | 7.0%
2012-13 | 307 | 5.7%
2007-08 | 276 | 5.1%
2017-18 | 276 | 5.1%
2013-14 | 272 | 5.0%
2014-15 | 257 | 4.7%
2015-16 | 250 | 4.6%
Other values (39) | 1718 | 31.7%

Frequencies of value counts

Unique

Unique: 3
Unique (%): 0.1%
Histogram of lengths of the category

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

convocatoria_acceso
Categorical

Distinct: 17
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
JUN | 3530
SEP | 1043
EXT | 288
FEB | 253
OCT | 90
Other values (12) | 221

Value | Count | Frequency (%)
JUN | 3530 | 65.1%
SEP | 1043 | 19.2%
EXT | 288 | 5.3%
FEB | 253 | 4.7%
OCT | 90 | 1.7%
JUL | 83 | 1.5%
FEX | 43 | 0.8%
DIC | 39 | 0.7%
ENE | 26 | 0.5%
MAR | 10 | 0.2%
Other values (7) | 20 | 0.4%

Frequencies of value counts

Unique

Unique: 4
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

acceso
Real number (ℝ≥0)

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.367926267
Minimum: 1
Maximum: 20
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 1
Q3: 5
95th percentile: 5
Maximum: 20
Range: 19
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.051944324
Coefficient of variation (CV): 0.8665575245
Kurtosis: 10.12127567
Mean: 2.367926267
Median Absolute Deviation (MAD): 0
Skewness: 2.117658595
Sum: 12846
Variance: 4.210475511
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)

Common values
Value | Count | Frequency (%)
1 | 3408 | 62.8%
5 | 1389 | 25.6%
3 | 527 | 9.7%
10 | 38 | 0.7%
6 | 27 | 0.5%
4 | 14 | 0.3%
17 | 8 | 0.1%
20 | 7 | 0.1%
2 | 3 | 0.1%
7 | 2 | < 0.1%

Minimum 5 values
Value | Count | Frequency (%)
1 | 3408 | 62.8%
2 | 3 | 0.1%
3 | 527 | 9.7%
4 | 14 | 0.3%
5 | 1389 | 25.6%

Maximum 5 values
Value | Count | Frequency (%)
20 | 7 | 0.1%
17 | 8 | 0.1%
10 | 38 | 0.7%
9 | 2 | < 0.1%
7 | 2 | < 0.1%

des_acceso
Categorical

Distinct: 11
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
Selectividad | 3408
Título Universitario | 1389
Formación Profesional | 527
Traslado de Expediente (Estudios Españoles) | 38
Acceso a Segundo Ciclo | 27
Other values (6) | 36

Value | Count | Frequency (%)
Selectividad | 3408 | 62.8%
Título Universitario | 1389 | 25.6%
Formación Profesional | 527 | 9.7%
Traslado de Expediente (Estudios Españoles) | 38 | 0.7%
Acceso a Segundo Ciclo | 27 | 0.5%
Mayores de 25/40/45 años | 14 | 0.3%
Bachillerato Sin Prueba de Acceso | 8 | 0.1%
Título de Bachiller Homologado (Extranjeros) | 7 | 0.1%
COU sin selectividad | 3 | 0.1%
Estudios universitarios extranjeros parcialmente convalidados | 2 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 61
Median length: 12
Mean length: 15.32921659
Min length: 12

sub_acceso
Real number (ℝ≥0)

Distinct: 7
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.375115207
Minimum: 1
Maximum: 99
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 2
Q3: 5
95th percentile: 6
Maximum: 99
Range: 98
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 11.0081061
Coefficient of variation (CV): 2.51607228
Kurtosis: 67.54114125
Mean: 4.375115207
Median Absolute Deviation (MAD): 1
Skewness: 8.188160732
Sum: 23735
Variance: 121.1783998
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common values
Value | Count | Frequency (%)
1 | 2032 | 37.5%
5 | 1760 | 32.4%
2 | 849 | 15.6%
6 | 710 | 13.1%
99 | 70 | 1.3%
4 | 3 | 0.1%
3 | 1 | < 0.1%

Minimum 5 values
Value | Count | Frequency (%)
1 | 2032 | 37.5%
2 | 849 | 15.6%
3 | 1 | < 0.1%
4 | 3 | 0.1%
5 | 1760 | 32.4%

Maximum 5 values
Value | Count | Frequency (%)
99 | 70 | 1.3%
6 | 710 | 13.1%
5 | 1760 | 32.4%
4 | 3 | 0.1%
3 | 1 | < 0.1%

des_subacesso
Categorical

Distinct: 19
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
LOE (Grados) | 1760
LOGSE | 841
Curso de Adaptación | 733
Bachillerato LOMCE | 710
Titulado universitario | 656
Other values (14) | 725

Value | Count | Frequency (%)
LOE (Grados) | 1760 | 32.4%
LOGSE | 841 | 15.5%
Curso de Adaptación | 733 | 13.5%
Bachillerato LOMCE | 710 | 13.1%
Titulado universitario | 656 | 12.1%
Ciclos formativos | 445 | 8.2%
. | 102 | 1.9%
Formación Profesional II | 75 | 1.4%
Prueba de Acceso a la Universidad | 35 | 0.6%
COU | 31 | 0.6%
Other values (9) | 37 | 0.7%

Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 45
Median length: 12
Mean length: 14.45953917
Min length: 1

nota_acceso
Real number (ℝ≥0)

MISSING

Distinct: 1573
Distinct (%): 40.9%
Missing: 1578
Missing (%): 29.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.615799324
Minimum: 0
Maximum: 12.272
Zeros: 2
Zeros (%): < 0.1%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 5.29
Q1: 5.8255
Median: 6.44
Q3: 7.249
95th percentile: 8.66
Maximum: 12.272
Range: 12.272
Interquartile range (IQR): 1.4235

Descriptive statistics

Standard deviation: 1.056105224
Coefficient of variation (CV): 0.1596338057
Kurtosis: 1.255282834
Mean: 6.615799324
Median Absolute Deviation (MAD): 0.678
Skewness: 0.5318499533
Sum: 25450.98
Variance: 1.115358244
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
7 | 28 | 0.5%
6.5 | 23 | 0.4%
5.8 | 22 | 0.4%
5.75 | 21 | 0.4%
5.9 | 20 | 0.4%
6 | 18 | 0.3%
6.25 | 16 | 0.3%
5.6 | 15 | 0.3%
6.69 | 14 | 0.3%
5.65 | 14 | 0.3%
Other values (1563) | 3656 | 67.4%
(Missing) | 1578 | 29.1%

Minimum 5 values
Value | Count | Frequency (%)
0 | 2 | < 0.1%
1.369 | 1 | < 0.1%
1.559 | 1 | < 0.1%
1.719 | 1 | < 0.1%
2.017 | 1 | < 0.1%

Maximum 5 values
Value | Count | Frequency (%)
12.272 | 1 | < 0.1%
10 | 2 | < 0.1%
9.9 | 1 | < 0.1%
9.859 | 1 | < 0.1%
9.8 | 1 | < 0.1%

nota_admision_def
Real number (ℝ≥0)

MISSING

Distinct: 1615
Distinct (%): 59.8%
Missing: 2725
Missing (%): 50.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.702148889
Minimum: 5
Maximum: 13.859
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 5
5th percentile: 5.3419
Q1: 6.1285
Median: 7.198
Q3: 8.91125
95th percentile: 11.75615
Maximum: 13.859
Range: 8.859
Interquartile range (IQR): 2.78275

Descriptive statistics

Standard deviation: 1.987739052
Coefficient of variation (CV): 0.258075906
Kurtosis: -0.02239247871
Mean: 7.702148889
Median Absolute Deviation (MAD): 1.268
Skewness: 0.8771500269
Sum: 20795.802
Variance: 3.95110654
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
7 | 11 | 0.2%
6.54 | 10 | 0.2%
5.9 | 9 | 0.2%
6.45 | 8 | 0.1%
5.6 | 8 | 0.1%
5.85 | 8 | 0.1%
7.15 | 8 | 0.1%
6.23 | 8 | 0.1%
5.92 | 8 | 0.1%
5.8 | 8 | 0.1%
Other values (1605) | 2614 | 48.2%
(Missing) | 2725 | 50.2%

Minimum 5 values
Value | Count | Frequency (%)
5 | 4 | 0.1%
5.004 | 1 | < 0.1%
5.018 | 2 | < 0.1%
5.02 | 3 | 0.1%
5.024 | 3 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
13.859 | 1 | < 0.1%
13.8 | 1 | < 0.1%
13.673 | 1 | < 0.1%
13.614 | 1 | < 0.1%
13.593 | 1 | < 0.1%

centro_escolar_acceso
Categorical

HIGH CARDINALITY
MISSING

Distinct: 137
Distinct (%): 4.6%
Missing: 2461
Missing (%): 45.4%
Memory size: 42.4 KiB

Value | Count
231-I.E.S. NORBA CAESARINA | 185
230-I.E.S. EL BROCENSE | 150
220-I.E.S. UNIVERSIDAD LABORAL | 103
232-I.E.S. PROFESOR HERNÁNDEZ PACHECO | 79
170-I.E.S. SANTA EULALIA | 69
Other values (132) | 2378

Value | Count | Frequency (%)
231-I.E.S. NORBA CAESARINA | 185 | 3.4%
230-I.E.S. EL BROCENSE | 150 | 2.8%
220-I.E.S. UNIVERSIDAD LABORAL | 103 | 1.9%
232-I.E.S. PROFESOR HERNÁNDEZ PACHECO | 79 | 1.5%
170-I.E.S. SANTA EULALIA | 69 | 1.3%
203-I.E.S. ALAGÓN | 63 | 1.2%
240-I.E.S. ÁGORA | 56 | 1.0%
210-I.E.S. LUIS DE MORALES | 52 | 1.0%
238-COLEGIO LICENCIADOS REUNIDOS | 50 | 0.9%
234-COLEGIO SAN ANTONIO DE PADUA | 47 | 0.9%
Other values (127) | 2110 | 38.9%
(Missing) | 2461 | 45.4%

Frequencies of value counts

Unique

Unique: 6
Unique (%): 0.2%

Histogram of lengths of the category

Length

Max length: 40
Median length: 19
Mean length: 16.0764977
Min length: 3

sexo
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
H | 4311
D | 1114

Value | Count | Frequency (%)
H | 4311 | 79.5%
D | 1114 | 20.5%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

fecha_nacimiento
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 3690
Distinct (%): 68.0%
Missing: 0
Missing (%): 0.0%
Memory size: 42.4 KiB

Value | Count
1992-06-13 | 9
1990-10-15 | 8
1992-02-04 | 8
1993-10-24 | 7
1992-03-30 | 7
Other values (3685) | 5386

Value | Count | Frequency (%)
1992-06-13 | 9 | 0.2%
1990-10-15 | 8 | 0.1%
1992-02-04 | 8 | 0.1%
1993-10-24 | 7 | 0.1%
1992-03-30 | 7 | 0.1%
1991-07-03 | 7 | 0.1%
1992-12-18 | 7 | 0.1%
1992-04-22 | 7 | 0.1%
1997-12-04 | 6 | 0.1%
1992-12-28 | 6 | 0.1%
Other values (3680) | 5353 | 98.7%

Frequencies of value counts

Unique

Unique: 2498
Unique (%): 46.0%
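
Distinct and Unique appear to measure different things here: 3690 different dates occur, while 2498 values are reported as unique. The sketch below illustrates one reading, on made-up values, in which Distinct counts different values and Unique counts values that occur exactly once. That interpretation of the labels is an assumption, as is the helper name `distinct_and_unique`.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    n_distinct = len(counts)
    n_unique = sum(1 for c in counts.values() if c == 1)
    return n_distinct, n_unique


# Made-up birth dates for illustration only.
sample = [
    "1992-06-13", "1992-06-13",
    "1990-10-15",
    "1993-10-24", "1993-10-24",
    "1991-07-03",
]
print(distinct_and_unique(sample))
```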

Histogram of lengths of the category

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

cod_provincia
Real number (ℝ≥0)

MISSING

Distinct: 50
Distinct (%): 1.0%
Missing: 162
Missing (%): 3.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10.67832035
Minimum: 0
Maximum: 60
Zeros: 8
Zeros (%): 0.1%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 0
5th percentile: 6
Q1: 6
Median: 10
Q3: 10
95th percentile: 31.9
Maximum: 60
Range: 60
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 8.64053009
Coefficient of variation (CV): 0.8091656559
Kurtosis: 9.667099711
Mean: 10.67832035
Median Absolute Deviation (MAD): 4
Skewness: 3.098695934
Sum: 56200
Variance: 74.65876023
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
10 | 2401 | 44.3%
6 | 2172 | 40.0%
28 | 138 | 2.5%
45 | 54 | 1.0%
37 | 51 | 0.9%
8 | 48 | 0.9%
11 | 44 | 0.8%
41 | 41 | 0.8%
20 | 25 | 0.5%
21 | 24 | 0.4%
Other values (40) | 265 | 4.9%
(Missing) | 162 | 3.0%

Minimum 5 values
Value | Count | Frequency (%)
0 | 8 | 0.1%
1 | 3 | 0.1%
2 | 1 | < 0.1%
3 | 9 | 0.2%
4 | 5 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
60 | 14 | 0.3%
52 | 1 | < 0.1%
51 | 6 | 0.1%
50 | 5 | 0.1%
49 | 6 | 0.1%

provincia
Categorical

MISSING

Distinct: 50
Distinct (%): 1.0%
Missing: 162
Missing (%): 3.0%
Memory size: 42.4 KiB

Value | Count
CÁCERES | 2401
BADAJOZ | 2172
MADRID | 138
TOLEDO | 54
SALAMANCA | 51
Other values (45) | 447

Value | Count | Frequency (%)
CÁCERES | 2401 | 44.3%
BADAJOZ | 2172 | 40.0%
MADRID | 138 | 2.5%
TOLEDO | 54 | 1.0%
SALAMANCA | 51 | 0.9%
BARCELONA | 48 | 0.9%
CÁDIZ | 44 | 0.8%
SEVILLA | 41 | 0.8%
GIPUZKOA | 25 | 0.5%
HUELVA | 24 | 0.4%
Other values (40) | 265 | 4.9%
(Missing) | 162 | 3.0%

Frequencies of value counts

Unique

Unique: 4
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 22
Median length: 7
Mean length: 6.900276498
Min length: 3

cod_municipio
Real number (ℝ≥0)

MISSING

Distinct: 278
Distinct (%): 5.3%
Missing: 179
Missing (%): 3.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 233.7384674
Minimum: 1
Maximum: 912
Zeros: 0
Zeros (%): 0.0%
Memory size: 42.4 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 1
Median: 90
Q3: 450
95th percentile: 760
Maximum: 912
Range: 911
Interquartile range (IQR): 449

Descriptive statistics

Standard deviation: 265.2541775
Coefficient of variation (CV): 1.13483322
Kurtosis: -0.9772410625
Mean: 233.7384674
Median Absolute Deviation (MAD): 89
Skewness: 0.6763307704
Sum: 1226192
Variance: 70359.77868
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values
Value | Count | Frequency (%)
1 | 2361 | 43.5%
584 | 308 | 5.7%
410 | 306 | 5.6%
215 | 170 | 3.1%
516 | 125 | 2.3%
264 | 104 | 1.9%
760 | 102 | 1.9%
55 | 62 | 1.1%
435 | 53 | 1.0%
785 | 44 | 0.8%
Other values (268) | 1611 | 29.7%
(Missing) | 179 | 3.3%

Minimum 5 values
Value | Count | Frequency (%)
1 | 2361 | 43.5%
5 | 1 | < 0.1%
8 | 2 | < 0.1%
10 | 15 | 0.3%
12 | 3 | 0.1%

Maximum 5 values
Value | Count | Frequency (%)
912 | 3 | 0.1%
888 | 1 | < 0.1%
884 | 1 | < 0.1%
875 | 2 | < 0.1%
868 | 3 | 0.1%

municipio
Categorical

HIGH CARDINALITY
MISSING

Distinct: 435
Distinct (%): 8.3%
Missing: 179
Missing (%): 3.3%
Memory size: 42.4 KiB

Value | Count
CÁCERES | 1365
BADAJOZ | 606
PLASENCIA | 308
MÉRIDA | 306
DON BENITO | 170
Other values (430) | 2491

Value | Count | Frequency (%)
CÁCERES | 1365 | 25.2%
BADAJOZ | 606 | 11.2%
PLASENCIA | 308 | 5.7%
MÉRIDA | 306 | 5.6%
DON BENITO | 170 | 3.1%
NAVALMORAL DE LA MATA | 125 | 2.3%
MADRID | 108 | 2.0%
CORIA | 104 | 1.9%
VILLANUEVA DE LA SERENA | 101 | 1.9%
ALMENDRALEJO | 62 | 1.1%
Other values (425) | 1991 | 36.7%
(Missing) | 179 | 3.3%

Frequencies of value counts

Unique

Unique: 170
Unique (%): 3.2%

Histogram of lengths of the category

Length

Max length: 26
Median length: 7
Mean length: 9.693271889
Min length: 3

Interactions

[64 pairwise interaction plots; images not preserved]

Correlations


Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for a linear relationship the steepness of the line (its angle to the x-axis) does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
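
The calculation described above can be sketched in plain Python as follows. This is a minimal illustration, not the code the report ran; the function name `pearson_r` and the sample data are made up. The last lines also check the invariance under changes of location and scale.

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    # The 1/(n - 1) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.0, 9.8]

r = pearson_r(x, y)
# Shift and rescale x: r should not change.
r_shifted = pearson_r([10 * a + 3 for a in x], y)
print(round(r, 4), round(r_shifted, 4))
```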

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
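
As a sketch of that recipe, the code below ranks both variables (tied values share the average of their positions, which is an assumption about tie handling) and then applies the covariance-over-standard-deviations formula to the ranks. The helper names and the data are illustrative only.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n; tied values get the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Covariance divided by the product of the standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 4, 9, 16, 25]  # monotonic but not linear in x
print(spearman_rho(x, y), pearson_r(x, y))
```

On this example the relationship is perfectly monotonic, so ρ reaches its maximum while Pearson's r stays below 1.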

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
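
The pair-counting definition can be sketched directly. The function below follows the formula exactly as worded above (a variant often called tau-a); library implementations commonly apply an adjustment for ties, so their results can differ when ties are present. The function name and data are illustrative.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs, with no tie adjustment."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Three observations give three pairs: two concordant, one discordant.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
print(tau)
```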

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
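
A minimal sketch of a bias-corrected Cramér's V, written from the commonly cited form of Bergsma's correction, is shown below. It takes a contingency table as a list of rows; the function name, the table layout and the example tables are assumptions made for illustration, and this is not necessarily how the report computed its values.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


independent = [[10, 10], [10, 10]]  # rows and columns unrelated
perfect = [[50, 0], [0, 50]]        # row fully determines column
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```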

Missing values

[4 missing-value plots; images not preserved]

Sample

First rows

index | expediente | cod_plan | des_plan | anio_apertura_expediente | exp_cerrado | exp_trasladado | tipo_traslado | exp_bloqueado | anio_convocatoria_acceso | convocatoria_acceso | acceso | des_acceso | sub_acceso | des_subacesso | nota_acceso | nota_admision_def | centro_escolar_acceso | sexo | fecha_nacimiento | cod_provincia | provincia | cod_municipio | municipio
0 | 2 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2004-05 | SEP | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1981-10-23 | 10.0 | CÁCERES | 1.0 | CÁCERES
1 | 3 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2000-01 | DIC | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1977-07-09 | 10.0 | CÁCERES | 1.0 | CÁCERES
2 | 5 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1994-95 | FEB | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | D | 1970-08-26 | 6.0 | BADAJOZ | 345.0 | JEREZ DE LOS CABALLEROS
3 | 6 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | N | N | NaN | N | 2003-04 | JUN | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1978-11-16 | 6.0 | BADAJOZ | 740.0 | VILLAFRANCA DE LOS BARROS
4 | 7 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1996-97 | OCT | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1971-11-01 | 6.0 | BADAJOZ | 1.0 | BADAJOZ
5 | 8 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2005-06 | SEP | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1984-04-26 | 10.0 | CÁCERES | 1.0 | CÁCERES
6 | 9 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 2006-07 | DIC | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1984-05-30 | 6.0 | BADAJOZ | 760.0 | VILLANUEVA DE LA SERENA
7 | 11 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | N | N | NaN | N | 2006-07 | DIC | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1979-06-21 | 17.0 | GIRONA | 252.0 | FIGUERES
8 | 12 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1989-90 | JUN | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1965-12-05 | 6.0 | BADAJOZ | 410.0 | MÉRIDA
9 | 13 | 1621 | MÁSTER EN COMPUTACIÓN GRID Y PARALELISMO | 2007-08 | S | N | NaN | N | 1987-88 | FEB | 6 | Acceso a Segundo Ciclo | 1 | . | NaN | NaN | NaN | H | 1963-12-14 | 60.0 | EXTRANJEROS | NaN | NaN

Last rows

index | expediente | cod_plan | des_plan | anio_apertura_expediente | exp_cerrado | exp_trasladado | tipo_traslado | exp_bloqueado | anio_convocatoria_acceso | convocatoria_acceso | acceso | des_acceso | sub_acceso | des_subacesso | nota_acceso | nota_admision_def | centro_escolar_acceso | sexo | fecha_nacimiento | cod_provincia | provincia | cod_municipio | municipio
5415 | 118 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 1992-93 | JUN | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1970-02-18 | 10.0 | CÁCERES | 256.0 | COLLADO DE LA VERA
5416 | 120 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2016-17 | JUL | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1990-09-03 | 3.0 | ALICANTE | 310.0 | DÉNIA
5417 | 121 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2018-19 | SEP | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1991-08-13 | 6.0 | BADAJOZ | 450.0 | NAVALVILLAR DE PELA
5418 | 124 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2019-20 | JUN | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | D | 1994-10-26 | 10.0 | CÁCERES | 444.0 | MADROÑERA
5419 | 125 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2018-19 | JUL | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1993-03-23 | 10.0 | CÁCERES | 772.0 | TRUJILLO
5420 | 127 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2018-19 | SEP | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | D | 1996-11-11 | 6.0 | BADAJOZ | 740.0 | VILLAFRANCA DE LOS BARROS
5421 | 128 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2019-20 | JUL | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1996-02-13 | 10.0 | CÁCERES | 1.0 | CÁCERES
5422 | 129 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2019-20 | SEP | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1992-05-10 | 6.0 | BADAJOZ | 135.0 | CAMPANARIO
5423 | 130 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2020-21 | NOV | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | D | 1997-04-11 | 10.0 | CÁCERES | 700.0 | SIERRA DE FUENTES
5424 | 131 | 1639 | MÁSTER UNIV. EN METODOLOGÍA BIM EN EL DESARROLLO COLABORATIVO DE PROYECTOS | 2020-21 | N | N | NaN | N | 2020-21 | NOV | 5 | Título Universitario | 1 | Titulado universitario | NaN | NaN | NaN | H | 1992-06-13 | 6.0 | BADAJOZ | 265.0 | FUENTE DEL MAESTRE